Query-driven iterated neighborhood graph search for scalable visual indexing
Authors
Abstract
In this paper, we address the approximate nearest neighbor (ANN) search problem over large-scale visual descriptors. We investigate a simple but very effective approach, neighborhood graph (NG) search, which conducts a local search by expanding neighborhoods in a best-first manner. Expanding neighborhoods makes it efficient to locate descriptors that have a high probability of being true NNs. However, NG search suffers from its local nature: it often returns sub-optimal solutions, or it must keep expanding neighborhoods exhaustively to find better ones, which degrades query efficiency. We propose a query-driven iterated neighborhood graph search approach to improve the performance. We follow the iterated local search (ILS) strategy widely used for combinatorial and discrete optimization in operations research, and handle the key issue in applying ILS to neighborhood graph search: perturbation, which generates a new pivot from which to restart a local search. The key novelties are twofold: (1) defining the local solution of ANN search over a neighborhood graph; (2) presenting a perturbation scheme driven by the query and the search history to generate pivots for restarting a local search. Their main benefit is avoiding unnecessary neighborhood expansion and hence finding true NNs more efficiently. Experimental results on large-scale SIFT indexing and similar-image search with tiny images show that our approach performs much better than previous state-of-the-art ANN search approaches.
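The general idea described in the abstract, best-first neighborhood expansion restarted from new pivots, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not specify how a local solution is defined or how the perturbation step picks pivots, so the sketch assumes a fixed expansion budget per run and a caller-supplied pivot list. The names `build_knn_graph`, `iterated_ng_search`, `pivots`, and `budget_per_run` are hypothetical.

```python
import heapq
import math
import random


def build_knn_graph(points, k):
    """Brute-force k-NN graph: graph[i] lists the k points nearest to point i."""
    graph = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        graph.append(others[:k])
    return graph


def iterated_ng_search(points, graph, query, pivots, budget_per_run):
    """Return (distance, index) of the best point found.

    Each pivot starts one best-first local search that expands at most
    `budget_per_run` nodes (an assumed stopping rule, not the paper's).
    `visited` is shared across runs, so a restart never re-examines
    points already seen by an earlier run.
    """
    visited = set()
    best = (math.inf, -1)

    for pivot in pivots:
        if pivot in visited:
            continue  # already reached by an earlier run
        visited.add(pivot)
        d = math.dist(query, points[pivot])
        best = min(best, (d, pivot))
        frontier = [(d, pivot)]  # min-heap ordered by distance to the query

        for _ in range(budget_per_run):
            if not frontier:
                break
            _, node = heapq.heappop(frontier)  # closest unexpanded node
            for nb in graph[node]:
                if nb not in visited:
                    visited.add(nb)
                    d_nb = math.dist(query, points[nb])
                    best = min(best, (d_nb, nb))
                    heapq.heappush(frontier, (d_nb, nb))
    return best


# Toy usage: 200 random 2-D points, 4 random pivots (a stand-in for the
# query- and history-driven perturbation, which the abstract does not detail).
random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
graph = build_knn_graph(points, k=5)
query = (0.5, 0.5)
pivots = random.sample(range(len(points)), 4)
found_dist, found_idx = iterated_ng_search(points, graph, query, pivots, budget_per_run=10)
```

Sharing `visited` across restarts is one simple way to use the search history; a smarter pivot choice would replace the random sample used here.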
Similar resources
On Graph Query Optimization in Large Networks
The dramatic proliferation of sophisticated networks has resulted in a growing need for supporting effective querying and mining methods over such large-scale graph-structured data. At the core of many advanced network operations lies a common and critical graph query primitive: how to search graph structures efficiently within a large network? Unfortunately, the graph query is hard due to the ...
Query-Driven Indexing in Large-Scale Distributed Systems
Efficient and effective search in large-scale data repositories requires complex indexing solutions deployed on a large number of servers. Web search engines such as Google and Yahoo! already rely upon complex systems to be able to return relevant query results and keep processing times within the comfortable sub-second limit. Nevertheless, the exponential growth of the amount of content on the...
DSI: A Method for Indexing Large Graphs Using Distance Set
In recent years we have witnessed a great increase in modeling data as large graphs in multiple domains, such as XML, the semantic web, and social networks. In these circumstances, researchers are interested in querying a large graph as follows: given a large graph G and a query Q, report all the matches of Q in G. Since subgraph isomorphism checking is proved to be NP-Complete[1], it is infeasibl...
GiS: Fast Indexing and Querying of Graph Structures
We propose a new way of indexing a large database of graphs and processing exact subgraph matching (or subgraph isomorphism) and approximate (full) graph matching queries. Rather than decomposing a graph into smaller units (e.g., paths, trees, graphs) for indexing purposes, we represent each graph in the database by its graph signature, which is essentially a multiset, and each signature is the...
Scalable Image Annotation by Summarizing Training Samples into Labeled Prototypes
As the number of images increases, it is essential to provide fast search methods and intelligent filtering of images. To handle images in large datasets, some relevant tags are assigned to each image to describe its content. Automatic Image Annotation (AIA) aims to automatically assign a group of keywords to an image based on the visual content of the image. AIA frameworks have two main sta...